Models for Mirror Symmetry Breaking via β-Sheet-Controlled Copolymerization:
(i) Mass Balance and (ii) Probabilistic Treatment

Celia Blanco and David Hochberg

Centro de Astrobiología (CSIC-INTA), Carretera Ajalvir Kilómetro 4, 28850 Torrejón de Ardoz, Madrid, Spain

Experimental mechanisms that yield the growth of homochiral copolymers over their heterochiral
counterparts have been advocated by Lahav and coworkers. These chiral amplification mechanisms
proceed through racemic β-sheet controlled polymerization operative both in surface crystallites
and in solution. We develop two complementary theoretical models for these template-induced
desymmetrization processes leading to multi-component homochiral copolymers. First, assuming
reversible β-sheet formation, we take the free monomer pool to be in equilibrium with the polymer
strand within the template. This yields coupled nonlinear mass balance equations whose solutions
are used to calculate enantiomeric excesses and average lengths of the homochiral chains formed.
The second approach is a probabilistic treatment based on random polymerization. The occlusion
probabilities depend on the polymerization activation energies for each monomer species and are
proportional to the concentrations of the monomers in solution in the constant pool approximation.
The monomer occlusion probabilities are represented geometrically in terms of unit simplexes, from
which conditions for maximizing or minimizing the likelihood of mirror symmetry breaking can be
determined.



I. INTRODUCTION 

Mirror or chiral symmetry is broken in all known biological systems, where processes crucial for life, such as
replication, involve chiral supramolecular structures sharing the same chiral sign (homochirality). These chiral
structures are proteins, composed of amino acids found almost exclusively as the left-handed enantiomers (S), and
DNA and RNA polymers and sugars, whose chiral building blocks are right-handed (R) monocarbohydrates. The emergence
of this biological homochirality in the chemical evolution from prebiotic to living systems is an enticing enigma in
the origin of life and early evolution, and is a compelling problem that stimulates scientific activity transcending
the traditional boundaries of physics, chemistry and biology [1].

Biological homochirality of living systems involves large macromolecules; a key issue is therefore the relationship
of the polymerization process with the emergence of chirality. This problem has generated activity in theoretical
modeling aimed at understanding mirror symmetry breaking in chiral polymerization. Most of the models proposed
[?] can be understood as elaborated extensions and generalizations of Frank's original paradigmatic scheme
[3]. An early work is that of Sandars [?], who introduced a detailed polymerization process plus the basic elements
of enantiomeric cross inhibition as well as a chiral feedback mechanism in which only the largest polymers formed
can enhance the production of the monomers from an achiral substrate. He provided basic numerical studies of
symmetry breaking and bifurcation properties of this model for various values of the number of repeat units N. The
subsequent models cited below are variations on Sandars' original theme. Thus, Brandenburg and coworkers
studied the stability and conservation properties of a modified Sandars model and introduced a reduced N = 2
version including the effects of chiral bias. They included spatial extent [?] in this model to study the spread and
propagation of chiral domains as well as the influence of a background turbulent advection velocity field. The model
of Wattis and Coveney [6] differs from Sandars' in that polymers are allowed to grow to arbitrary lengths N and
the chiral polymers of all lengths, from the dimer upwards, act catalytically in the breakdown of the achiral
source into chiral monomers. An analytic linear stability analysis of both the racemic and chiral solutions is carried
out for the model's large-N limit and various kinetic timescales are identified. The role of external white noise on
Sandars-type polymerization networks including spatial extent has been explored by Gleiser and coworkers: the N = 2
truncated model cited above [3] is subjected to external white noise [?]; chiral bias is then considered, as well as
the influence of high intensity and long duration noise [?]. Modified Sandars-type models with spatial extent and
external noise [10] are considered, allowing for both finite and infinite N, with emphasis on the dynamics
of chiral symmetry breaking. On the experimental side, Luisi et al. [11] reported the polymerization of racemic
tryptophan, leucine, or isoleucine in buffered solutions, which yielded libraries of short oligopeptides in the range of
six to ten residues, where the isotactic peptides were formed as minor diastereoisomers, in amounts larger than those
predicted by a purely random binomial distribution.

One scenario for the transition from prebiotic racemic chemistry to chiral biology suggests that homochiral peptides
or amino acid chains must have appeared before the onset of the primeval enzymes [?]. However, except for a
couple of known cases [11, 12], the polymerization of racemic mixtures (i.e., in 1:1 proportions) of monomers in ideal
solutions typically yields chains composed of random sequences of both the left- and right-handed repeat units following
a binomial distribution [?]. This statistical problem has been overcome recently by the experimental demonstration of
the generation of amphiphilic peptides of homochiral sequence, that is, of a single chirality, from racemic compositions
or racemates. This route consists of two steps: (1) the formation of racemic parallel or anti-parallel β-sheets either in
aqueous solution or in 3-D crystals [19] during the polymerization of racemic hydrophobic α-amino acids (Figure 1),
followed by (2) an enantioselective controlled polymerization reaction [20-26]. This process leads to racemic or
mirror-symmetric mixtures of isotactic oligopeptides where the chains are composed of amino acid residues of a single
handedness (see Figure 1). Furthermore, when racemic mixtures of different types of amino acids were polymerized,
isotactic co-peptides of homochiral sequence were generated. The guest (S) and (R) molecules are enantioselectively
incorporated into the chains of the (S) and (R) peptides, respectively; however, the guest molecules are randomly
distributed within the corresponding homochiral chains, see Figure 2.
FIG. 1: Self-assembly of oligopeptides into racemic β-sheets, for the case of a single species ($R_0$, $S_0$) of amino acid
supplied in ideally racemic proportions. For a full experimental account, see Weissbuch et al. [19].



As a combined result of these two steps, the sequences of pairs of co-peptide S and R chains within the growing
template will differ from each other, see Figure 2. This results in non-racemic mixtures of co-peptide polymer chains of
different sequences. Consequently, by considering the sequences of the peptide chains, a statistical departure from the
racemic composition of the library of the peptide chains is created which varies with chain length and with the relative
concentrations of the monomers used in the polymerization [21, 22]. This can be appreciated by comparing Figure 1
and Figure 2: in the former the β-sheet is globally racemic (no guest amino acids), whereas the latter template is not,
by virtue of the randomness of the specific amino acid sequences within each homochiral strand, due to the presence
of guest amino acids. It is precisely here, in the β-sheet template, that mirror symmetry is stochastically broken.
Non-enantiomeric pairs of homochiral chains are formed; this mechanism relies crucially on the presence of more than
one type of amino acid. Note that this does not necessarily imply any net optical activity of the solution containing the
remaining free chiral monomers.

In this paper we report a theoretical investigation of multi-component copolymerization controlled by such templates.
The models we introduce take as given the prior formation of the initial templates or β-sheets
and are concerned exclusively with the subsequent enantioselective polymerization reactions. Thus the nonlinear
template control is implicit throughout our discussions. We consider two distinct model approaches to the problem.
The first is based on detailed balance, where the polymerization proceeds through stepwise isodesmic additions and
dissociations of the chiral monomers (amino acids) to one end of the growing homochiral chain within the template.
This can be treated knowing the compositions of the majority and minority monomers and their associated
equilibrium constants, and we use chemical kinetics at equilibrium as a useful approximation to completely solve the
problem. In thermodynamic equilibrium, detailed balance allows us to derive coupled sets of nonlinear mass balance
equations. Their solutions yield the equilibrium concentrations of all the monomers and chiral copolymers in terms
of the equilibrium constants and the initial total monomer compositions. With these in hand, we can calculate the
enantiomeric excesses of the homochiral chains and their average lengths. The degree of mirror symmetry breaking
depends on the number of monomer components and their relative concentrations [27]. A brief communication
of preliminary results obtained from this equilibrium model was reported recently by the authors [27]. This paper
extends and generalizes that previous work.

The second model approach is based on strictly probabilistic or statistical considerations and does not assume
chemical equilibrium. We will consider the general case involving racemic mixtures of various species of enantiomers,
that is, a variety of racemic guest molecules, which can occlude randomly into the chiral sites of the host racemic
β-sheet or crystal site [21], [?], [?]. Among the questions to be addressed: how many species $m$ of such guests
are needed to break mirror symmetry? How many repeat units $N$ should the homochiral chains have? What are the
ideal mole fractions of the monomers in solution for symmetry breaking? To answer these questions we first calculate
the probability that a given homochiral sequence is formed from the majority and minority species. For random
copolymerization, the attachment probability, or the probability of occlusion by the template/crystal, of an amino
acid monomer to the growing chain is proportional to its concentration in solution, and we will invoke the constant
pool approximation. This probability also depends on the polymerization activation energy of the individual monomers
(through the Arrhenius relation). The second part deals with combinatorics: counting the number of rearrangements
of a given sequence, as all these independent sequences or "re-shufflings" have the same probability of forming. The
information from both these parts permits us to calculate the joint probability of finding enantiomeric pairs, and from
this we deduce the net probability of finding non-enantiomeric pairs. The latter provides a statistical measure of
the likelihood that mirror symmetry is broken as a function of chain length and the number and concentration of the
minority species. We will then generalize these arguments to the case of many additives, and even allow nonracemic
initial concentrations for all the amino acid species. Here again, the underlying kinetic template control is assumed
implicitly.
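The two ingredients just described (occlusion probabilities and the counting of re-shufflings) can be illustrated with a short Python sketch. It is not the calculation carried out in this paper: the function names are hypothetical, the Arrhenius weights are assumed to share a common prefactor, and an "enantiomeric pair" is taken here to mean an R strand and an S strand of length N that carry the same sequence of species, with both strands grown independently from a racemic constant pool.

```python
from math import exp, factorial

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1

def occlusion_probabilities(conc, E_act, T=298.15):
    """Normalised attachment probabilities, p_i proportional to c_i * exp(-E_i / (R T)).

    conc  : monomer concentrations in the (constant) pool
    E_act : activation energies in J/mol (assumed inputs, common prefactor)
    """
    weights = [c * exp(-E / (R_GAS * T)) for c, E in zip(conc, E_act)]
    total = sum(weights)
    return [w / total for w in weights]

def sequence_probability(p, counts):
    """Probability of ONE specific sequence containing counts[i] monomers of species i."""
    out = 1.0
    for p_i, n_i in zip(p, counts):
        out *= p_i ** n_i
    return out

def rearrangements(counts):
    """Number of distinct sequences ("re-shufflings") with the given composition."""
    out = factorial(sum(counts))
    for n_i in counts:
        out //= factorial(n_i)
    return out

def prob_non_enantiomeric_pair(p, N):
    """1 - P(two independent strands of length N carry the same sequence).

    Summing P(sequence)**2 over all sequences of length N gives (sum_i p_i**2)**N.
    """
    return 1.0 - sum(p_i * p_i for p_i in p) ** N

# Hypothetical example: a host at 0.9 and one guest at 0.1 (relative pool
# concentrations), equal activation energies.
p = occlusion_probabilities([0.9, 0.1], [5.0e4, 5.0e4])
for N in (2, 5, 10, 20):
    print(N, prob_non_enantiomeric_pair(p, N))
```

Under these assumptions the probability of a non-enantiomeric pair vanishes for a single species and grows with both chain length and guest fraction, which is the qualitative behavior the questions above are aimed at.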

Both approaches assume that the template-controlled polymerization obeys a first-order Markov process.
Experiments carried out in solution [11, 12] appear to confirm this expectation: these results were subsequently
rationalized by a mathematical model assuming a first-order Markov mechanism [28].

These two theoretical perspectives afford a complementary view of the induced mirror symmetry breaking scenario 
as originally proposed by Lahav and coworkers. The first scenario holds for closed systems in equilibrium where the 
monomers are depleted/replenished by the polymerization. We can nevertheless approximate irreversible polymer- 
ization as well as we please by simply choosing sufficiently large equilibrium constants. The numerical effects are 
negligible. The second approach is apt for open systems where the monomer pool is held constant and is free from 
the assumption of equilibrium. 

II. THEORETICAL METHODS I 
A. Mass Balance 

To address the general setting for the generation of libraries of diastereoisomeric mixtures of peptides as originally
proposed by Nery et al. [21], we need a suitable generalization of their scenario. To this end we consider the case
where we have a majority amino acid species $(R_0, S_0)$ and a given number $m \geq 1$ of minority amino acid species
$(R_1, S_1), (R_2, S_2), \ldots, (R_m, S_m)$. Since the following calculations are based on chemical equilibrium and detailed
balance, if all $(m+1)$ species are supplied in strictly 1:1 racemic proportions, we would justifiably expect a racemic
outcome, that is, no mirror symmetry breaking. However, we can test the model's ability for chiral amplification by
considering unequal initial proportions for the $m$ minority species in solution. That is, does the enantiomeric excess ee
increase as a function of chain length, and is it greater than the initial ee of the monomers? The three-monomer case
originally treated [21] is a specific example of this for $m = 1$ and with $R_1 = 0$, that is, the system contains $R_0$, $S_0$
and only the enantiomer $S_1$ of the guest species. We assume as given the prior formation of the initial templates or
β-sheets, and are concerned exclusively with the subsequent enantioselective random polymerization reactions (step (2)).
The underlying nonlinear template control is implicit throughout the discussion. We consider stepwise additions and
dissociations of single monomers from one end of the (co)polymer chain, considered as a strand within the β-sheet,
see Figure 2. It is reasonable to regard the β-sheet as being in equilibrium with the free monomer pool [32]*.

From detailed balance, each individual monomer attachment or dissociation reaction is in equilibrium. This holds
for closed equilibrium systems in which the free monomers are depleted/replenished by the templated polymerization.
Then we can compute the equilibrium concentrations of all the (co)polymers in terms of equilibrium
constants $K_i$ for each individual amino acid and the free monomer concentrations. The equilibrium concentration
of an S-type copolymer chain of length $n_0 + n_1 + n_2 + \cdots + n_m = N$ made up of $n_j$ molecules of type $S_j$
is given by $p^S_{n_0,n_1,\ldots,n_m} = (K_0 s_0)^{n_0} (K_1 s_1)^{n_1} \cdots (K_m s_m)^{n_m} / K_0$, where $s_j = [S_j]$ [29]. Similarly, the
concentration of an R-type copolymer chain of length $n'_0 + n'_1 + n'_2 + \cdots + n'_m = N$ made up of $n'_j$ molecules of type $R_j$ is
$p^R_{n'_0,n'_1,\ldots,n'_m} = (K_0 r_0)^{n'_0} (K_1 r_1)^{n'_1} \cdots (K_m r_m)^{n'_m} / K_0$, where $r_j = [R_j]$. Note that we are considering only copolymers
with random sequences such as $R_0$-$R_0$-$R_1$-$R_0$-$R_0$-$R_2$-$R_0$-... and $S_0$-$S_0$-$S_1$-$S_1$-$S_0$-$S_2$-$S_0$-...,
but not heterochiral polymers (that is, no sequences involving both the S and R type monomers). The equilibrium
concentration equations we write down for $p^S_{n_0,n_1,\ldots,n_m}$ and $p^R_{n'_0,n'_1,\ldots,n'_m}$ implicitly assume the underlying template control.

* Ref. [32] reports a stochastic simulation of two concurrent orthogonal processes: (1) an irreversible condensation of
activated amino acids and (2) reversible formation of racemic β-sheets of alternating homochiral strands. The two steps
taken together comprise a two-dimensional formulation of the problem. These architectures lead to the formation of
chiral peptides whose isotacticity increases with length.
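As a small illustration of the expression above for the concentration of one specific copolymer sequence, the following Python sketch evaluates it for given equilibrium constants and free-monomer concentrations. The function name and the numerical values are hypothetical and chosen only for illustration.

```python
from math import prod

def copolymer_concentration(K, free, counts):
    """Equilibrium concentration of one specific homochiral copolymer sequence.

    K[j]      : equilibrium constant of amino acid j (K[0] belongs to the host)
    free[j]   : free-monomer concentration of that enantiomer (s_j or r_j)
    counts[j] : number n_j of type-j monomers in the chain
    Evaluates prod_j (K_j * free_j)**n_j / K_0.
    """
    if not (len(K) == len(free) == len(counts)):
        raise ValueError("K, free and counts must have the same length")
    return prod((k * c) ** n for k, c, n in zip(K, free, counts)) / K[0]

# Hypothetical numbers: one host (index 0) and one guest (index 1).
K = [1000.0, 1000.0]     # M^-1
s = [2.0e-4, 1.0e-4]     # M, free S_0 and S_1
# Any single sequence containing two S_0 and one S_1 has this concentration:
print(copolymer_concentration(K, s, [2, 1]))
```

The result depends only on the composition $(n_0, \ldots, n_m)$, not on the order of the monomers, which is why every rearrangement of a given sequence carries the same equilibrium concentration.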




FIG. 2: The proposed scheme leading to enantioselective occlusion within racemic β-sheet templates. The case illustrated
has a majority host species $(R_0, S_0)$ and two minority guest species $(R_1, S_1)$ and $(R_2, S_2)$ of amino acids, all of which
are provided in ideally racemic proportions. The amino acids of a given chirality attach to sites of the same handedness
within the growing β-sheet, leading to the polymerization of oligomer strands of a uniform chirality, in the alternating
row S-R-S-R-... fashion as depicted. Since the polymerization in any given row is random and the guest monomers are
typically less abundant than the host species, the former will occlude in a random way, leading to independent
uncorrelated random sequences in each chiral strand. The overall process yields non-enantiomeric pairs of homochiral
copolymers, so that mirror symmetry is broken in a stochastic manner. The corresponding mass balance equations, Eq. (5),
are obtained assuming the monomer attachment/dissociation is in chemical equilibrium.



The number of different S-type copolymers of length $l$ with $n_j$ molecules of type $S_j$, for $0 \leq j \leq m$ species, is given
by the multinomial coefficient:

$$\binom{l}{n_0, n_1, \ldots, n_m} = \frac{l!}{n_0!\, n_1! \cdots n_m!}. \tag{1}$$

Hence the total concentration of the S-type copolymers of length $l$ within the β-sheet is given by

$$p^S_l = \sum_{n_0+n_1+\cdots+n_m=l} \binom{l}{n_0, n_1, \ldots, n_m}\, p^S_{n_0,n_1,\ldots,n_m} = \frac{1}{K_0}\,(K_0 s_0 + K_1 s_1 + \cdots + K_m s_m)^l, \tag{2}$$

which follows from the multinomial theorem [30]. From this we can calculate the number of each type $S_j$ of S-monomer
present in the S-copolymers of length equal to $l$, for any $0 \leq j \leq m$:

$$s_j(p^S_l) = \sum_{n_0+n_1+\cdots+n_m=l} \binom{l}{n_0, n_1, \ldots, n_m}\, n_j\, p^S_{n_0,n_1,\ldots,n_m} = s_j \frac{\partial p^S_l}{\partial s_j}
= \frac{K_j}{K_0}\, s_j\, l\, (K_0 s_0 + K_1 s_1 + \cdots + K_m s_m)^{l-1}. \tag{3}$$
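The two multinomial sums above can be checked numerically by brute-force enumeration of all compositions $(n_0, \ldots, n_m)$ of a given length. The Python sketch below does this for small $l$; the helper names and the test values of $K_j$ and $s_j$ are hypothetical and serve only to compare the enumerated sums with the closed forms.

```python
from itertools import product
from math import factorial, prod

def multinomial(counts):
    """l! / (n_0! n_1! ... n_m!) for counts = (n_0, ..., n_m), l = sum(counts)."""
    out = factorial(sum(counts))
    for n in counts:
        out //= factorial(n)
    return out

def compositions(l, parts):
    """All tuples of `parts` non-negative integers that sum to l."""
    return [c for c in product(range(l + 1), repeat=parts) if sum(c) == l]

def check_sums(K, s, l, j):
    """Return (enumerated p_l, closed-form p_l, enumerated s_j(p_l), closed-form s_j(p_l))."""
    x = [k * c for k, c in zip(K, s)]              # the products K_j * s_j
    a = sum(x)

    def p(n):                                      # concentration of one sequence
        return prod(xi ** ni for xi, ni in zip(x, n)) / K[0]

    comps = compositions(l, len(K))
    p_l_enum = sum(multinomial(n) * p(n) for n in comps)
    s_j_enum = sum(multinomial(n) * n[j] * p(n) for n in comps)
    p_l_closed = a ** l / K[0]
    s_j_closed = (K[j] / K[0]) * s[j] * l * a ** (l - 1)
    return p_l_enum, p_l_closed, s_j_enum, s_j_closed

# Hypothetical values: host plus two guests, chains of length 4, guest j = 1.
print(check_sums([1000.0, 500.0, 2000.0], [2.0e-4, 1.0e-4, 5.0e-5], 4, 1))
```

The enumeration grows quickly with $l$ and $m$, so it is useful only as a sanity check; the closed forms are what enter the mass balance equations.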
Then we need to know the total amount of the S-type monomers bound within the S-copolymers (in the β-sheet)
from the dimer on up to a maximum chain length $N$. Using Eq. (3) for the $j$th type of amino acid, this is given by

$$s_j(p^S_{tot}) = \sum_{l=2}^{N} s_j(p^S_l) \to \frac{K_j}{K_0}\, s_j\, \frac{a(2-a)}{(1-a)^2}, \tag{4}$$

where the final expression holds in the limit $N \to \infty$ provided that $a = (K_0 s_0 + K_1 s_1 + \cdots + K_m s_m) < 1$. This must
be the case, otherwise the system would contain an infinite number of molecules [29]. Similar considerations hold
for the R-sector, and the total amount of R monomers inside R copolymers for the $j$th amino acid is given by
$r_j(p^R_{tot}) = \frac{K_j}{K_0}\, r_j\, \frac{b(2-b)}{(1-b)^2}$, where $b = (K_0 r_0 + K_1 r_1 + \cdots + K_m r_m) < 1$. From this we obtain the mass balance equations,
which hold for both enantiomers S, R of the host and guest amino acids and are our key result [27]:

$$s_{j,tot} = s_j + \frac{K_j}{K_0}\, s_j\, \frac{a(2-a)}{(1-a)^2}, \qquad r_{j,tot} = r_j + \frac{K_j}{K_0}\, r_j\, \frac{b(2-b)}{(1-b)^2}. \tag{5}$$

These equations express the fact that each type of enantiomer is either free in solution, or is else bound inside a 
(co)polymer strand within the template. 
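One way to solve the mass balance equations numerically is sketched below in Python. It is an illustrative approach, not necessarily the one used to produce the results in this paper. It relies on two observations: the S and R sectors decouple, since $a$ involves only the $s_j$ and $b$ only the $r_j$; and within a sector every free concentration is fixed once $a$ is known, so the problem reduces to a scalar root search for $a$ in $[0, 1)$. The function name `solve_sector` and the example values are hypothetical.

```python
def solve_sector(K, totals, iters=200):
    """Free-monomer concentrations for one chirality sector (all S or all R).

    Solves x_tot_j = x_j + (K_j / K_0) * x_j * a*(2 - a)/(1 - a)**2,
    with a = sum_j K_j * x_j, by bisection on a in [0, 1).
    Returns (list of free concentrations x_j, a).
    """
    def free(a):
        u = a * (2.0 - a) / (1.0 - a) ** 2
        return [t / (1.0 + (k / K[0]) * u) for k, t in zip(K, totals)]

    def mismatch(a):
        # sum_j K_j x_j(a) - a is strictly decreasing in a, so the root is unique.
        return sum(k * x for k, x in zip(K, free(a))) - a

    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return free(a), a

# Hypothetical three-monomer example (host R, host S, guest S'), equal K.
K = [1000.0, 1000.0]                    # M^-1, host and guest
c_tot = 1.0e-3                          # M
(s, s_guest), a = solve_sector(K, [0.25 * c_tot, 0.25 * c_tot])
(r, _), b = solve_sector(K, [0.50 * c_tot, 0.0])
print("free s, s', r:", s, s_guest, r)
print("a, b:", a, b)
```

Bisection is chosen here only because the mismatch function is monotonic, which makes the search robust without any starting guess; a Newton iteration would also work.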

The problem then consists in the following: given the total concentrations of all the $m+1$ host plus guest enantiomers
$\{s_{j,tot}, r_{j,tot}\}_{j=0}^{m}$ and the equilibrium constants $K_i$, we calculate the free monomer concentrations in solution
$\{s_j, r_j\}_{j=0}^{m}$ by solving the nonlinear equations Eqs. (5). Denote by
$s_{0,tot} + \cdots + s_{m,tot} + r_{0,tot} + \cdots + r_{m,tot} = c_{tot}$ the total system
concentration. From the solutions of Eq. (5) we can calculate, e.g., the equilibrium concentrations of homochiral
copolymers $p^S_{n_0,n_1,\ldots,n_m}$ and $p^R_{n'_0,n'_1,\ldots,n'_m}$ of any specific sequence or composition, as well as the resultant
enantiomeric excess for homochiral chains of length $l$ composed of the host (majority) amino acid:

$$ee_l = \frac{(r_0)^l - (s_0)^l}{(r_0)^l + (s_0)^l}. \tag{6}$$
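The enantiomeric excess of the host chains depends on the free concentrations only through the ratio of $r_0$ to $s_0$, which the short Python sketch below makes explicit. It assumes the sign convention in which the R chains enter the numerator with a positive sign, and it is written in terms of the ratio so that long chains do not cause numerical underflow. The function name and the sample concentrations are hypothetical.

```python
def ee(r0, s0, l):
    """Enantiomeric excess of host homochiral chains of length l.

    Equivalent to (r0**l - s0**l) / (r0**l + s0**l), evaluated through the
    ratio of the smaller to the larger concentration to avoid underflow.
    """
    if r0 < 0 or s0 < 0 or (r0 == 0 and s0 == 0):
        raise ValueError("need non-negative concentrations, not both zero")
    if r0 >= s0:
        x = (s0 / r0) ** l
        return (1.0 - x) / (1.0 + x)
    x = (r0 / s0) ** l
    return (x - 1.0) / (x + 1.0)

# Hypothetical free concentrations (M): the ee grows with chain length.
for l in (1, 2, 5, 10):
    print(l, ee(3.0e-4, 1.0e-4, l))
```

Because only the ratio matters, any excess of one free host enantiomer over the other is amplified monotonically with chain length.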

At this juncture it is important to point out that our above approach assumes that the polymerization reactions are 
under thermodynamic control. If there are any kinetic effects, they will not be seen as they would contribute to the chain 
compositions at shorter (finite) time scales. Our aim here is to obtain the compositions at asymptotically long relaxation 
times, and we thus hypothesize that the dominant pathways are under thermodynamic control. 



B. Average chain lengths 



We can calculate the average copolymer chain lengths as functions of the initial monomer compositions $s_{j,tot}$, $r_{j,tot}$ and
the equilibrium constants $K_j$, using the solutions of our mass balance equations Eq. (5).

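The ensemble-averaged chain lengths afford an alternative measure of the degree of mirror symmetry breaking
resulting from the desymmetrization process discussed in Nery et al. [21]. There are a number of relevant and interesting
averages one can define and calculate. The average chain lengths, starting from the dimers, of the S-type copolymers,
composed of random sequences of the $S_j$ type monomers, and that of the R-type copolymers composed of random
sequences of the $R_j$ type monomers are given by:

$$\langle l_S \rangle = \frac{\sum_{l=2}^{N}\left(s_0(p^S_l) + s_1(p^S_l) + \cdots + s_m(p^S_l)\right)}{\sum_{l=2}^{N} p^S_l}
\to \frac{\left(s_0 + \frac{K_1}{K_0}s_1 + \cdots + \frac{K_m}{K_0}s_m\right)\frac{a(2-a)}{(1-a)^2}}{\frac{a^2}{(1-a)K_0}} = \frac{2-a}{1-a}, \tag{7}$$

$$\langle l_R \rangle = \frac{\sum_{l=2}^{N}\left(r_0(p^R_l) + r_1(p^R_l) + \cdots + r_m(p^R_l)\right)}{\sum_{l=2}^{N} p^R_l}
\to \frac{\left(r_0 + \frac{K_1}{K_0}r_1 + \cdots + \frac{K_m}{K_0}r_m\right)\frac{b(2-b)}{(1-b)^2}}{\frac{b^2}{(1-b)K_0}} = \frac{2-b}{1-b}, \tag{8}$$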

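respectively. We also obtain an expression for the average length of the polymer chains composed exclusively from
sequences of the $S_j$ or $R_j$ enantiomers of a given specific amino acid of type $j$:

$$\langle l_{S_j} \rangle = \frac{\sum_{l=2}^{N} \frac{l}{K_0}(K_j s_j)^l}{\sum_{l=2}^{N} \frac{1}{K_0}(K_j s_j)^l}
\to \frac{\frac{(K_j s_j)^2 (2 - K_j s_j)}{K_0 (1 - K_j s_j)^2}}{\frac{(K_j s_j)^2}{K_0 (1 - K_j s_j)}} = \frac{2 - K_j s_j}{1 - K_j s_j}, \tag{9}$$

$$\langle l_{R_j} \rangle = \frac{\sum_{l=2}^{N} \frac{l}{K_0}(K_j r_j)^l}{\sum_{l=2}^{N} \frac{1}{K_0}(K_j r_j)^l}
\to \frac{\frac{(K_j r_j)^2 (2 - K_j r_j)}{K_0 (1 - K_j r_j)^2}}{\frac{(K_j r_j)^2}{K_0 (1 - K_j r_j)}} = \frac{2 - K_j r_j}{1 - K_j r_j}. \tag{10}$$

TABLE I: The definitions of the various average chain lengths $\langle l \rangle$, $\langle l_S \rangle$, $\langle l_R \rangle$, $\langle l_{S_j} \rangle$ and $\langle l_{R_j} \rangle$ employed.

$\langle l \rangle$          Average length of all the copolymers in the system
$\langle l_S \rangle$        Average length of all the S-type copolymers, composed of random sequences of the $S_j$ type monomers
$\langle l_R \rangle$        Average length of all the R-type copolymers, composed of random sequences of the $R_j$ type monomers
$\langle l_{S_j} \rangle$    Average length of the polymer exclusively composed from sequences of the $S_j$ enantiomers of a given amino acid type $j$
$\langle l_{R_j} \rangle$    Average length of the polymer exclusively composed from sequences of the $R_j$ enantiomers of a given amino acid type $j$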

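To complete the list, we can calculate the chain length averaged over all the copolymers in the system:

$$\langle l \rangle = \frac{\sum_{l=2}^{N}\left(s_0(p^S_l) + s_1(p^S_l) + \cdots + s_m(p^S_l) + r_0(p^R_l) + r_1(p^R_l) + \cdots + r_m(p^R_l)\right)}{\sum_{l=2}^{N}\left(p^S_l + p^R_l\right)}$$

$$\to \frac{\left(s_0 + \frac{K_1}{K_0}s_1 + \cdots + \frac{K_m}{K_0}s_m\right)\frac{a(2-a)}{(1-a)^2} + \left(r_0 + \frac{K_1}{K_0}r_1 + \cdots + \frac{K_m}{K_0}r_m\right)\frac{b(2-b)}{(1-b)^2}}{\frac{a^2}{(1-a)K_0} + \frac{b^2}{(1-b)K_0}}
= \frac{a^2(2-a)(1-b)^2 + b^2(2-b)(1-a)^2}{a^2(1-b)^2(1-a) + b^2(1-b)(1-a)^2}. \tag{11}$$

The right-most expressions ($\to$) in each case hold in the limit $N \to \infty$ and for $a < 1$ and $b < 1$. See Table I
for definitions of all these quantities.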
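The limiting expressions for the average chain lengths can be checked against the truncated sums with a few lines of Python. The sketch below uses the fact that the monomers bound in all S-type chains of length $l$ add up to $l\, a^l / K_0$, so that the common factor $1/K_0$ cancels in the ratio. Function names and the values of $a$ and $b$ are hypothetical.

```python
def avg_length_closed(a):
    """Large-N limit of the sector average chain length, (2 - a) / (1 - a), for a < 1."""
    return (2.0 - a) / (1.0 - a)

def avg_length_truncated(a, N):
    """Same average from the sums over l = 2..N (the factor 1/K_0 cancels)."""
    num = sum(l * a ** l for l in range(2, N + 1))
    den = sum(a ** l for l in range(2, N + 1))
    return num / den

def avg_length_all(a, b):
    """Large-N limit of the average over all S- and R-type copolymers."""
    num = a * a * (2.0 - a) * (1.0 - b) ** 2 + b * b * (2.0 - b) * (1.0 - a) ** 2
    den = a * a * (1.0 - a) * (1.0 - b) ** 2 + b * b * (1.0 - b) * (1.0 - a) ** 2
    return num / den

# Hypothetical values of a and b: the truncated sum approaches the closed form.
for N in (5, 20, 200):
    print(N, avg_length_truncated(0.5, N), avg_length_closed(0.5))
print(avg_length_all(0.2, 0.6))
```

The overall average always lies between the two sector averages, and it coincides with them when $a = b$.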



III. RESULTS 



A. Induced desymmetrization 



We turn to the scenario discussed in Nery et al. [21] and consider the influence of a single guest species, so $m = 1$
will be sufficient for our purposes. For a single guest, we drop numbered indices and denote the majority host species
and concentrations by $r = [R]$, $s = [S]$ and the minority guest with a prime: $s' = [S']$.

We use the above framework to calculate the enantiomeric excess ee as a function of chain length $l$ for the three
starting compositions of the monomer crystals as reported [21]. In Fig. 3 we plot the numerical results obtained from
calculating Eq. (6); the only quantities required for this are the solutions for $r$ and $s$ obtained from solving the set of
equations Eq. (5). For strictly illustrative purposes, we set the equilibrium constants to be the same for both host
and guest monomers, $K_1 = K_0 = K = 1000\ \mathrm{M}^{-1}$, and the total initial concentration to $c_{tot} = 10^{-3}$ M. The initial fractions
of each component are denoted by $f = \{f_r, f_s, f_{s'}\}$ and obey $f_r + f_s + f_{s'} = 1$. The starting composition of the
mixture is $c_{tot} = r_{tot} + s_{tot} + s'_{tot}$, and the total amount of each component is $r_{tot} = c_{tot} f_r$, $s_{tot} = c_{tot} f_s$ and
$s'_{tot} = c_{tot} f_{s'}$. We can appreciate the induced symmetry breaking mechanism [21] from the behavior of $ee_l$. For
the first case, $f_r : f_s : f_{s'} = 0.5 : 0.25 : 0.25$, mirror symmetry is broken for almost all the chain lengths, even for
small values of $l$: for $l = 3$ the ee reaches 60% and for $l = 5$ the ee is found to be greater than 80%. This is due to
the equal starting fractions of the majority $s_{tot}$ and the guest $s'_{tot}$ monomer species of the same chirality; the large
amount of guest is the reason for these large values of ee. For the second case, $f_r : f_s : f_{s'} = 0.5 : 0.45 : 0.05$, the
starting fraction of the majority species, $s_{tot}$, is almost 10 times (0.45/0.05 = 9) greater than that of the guest, $s'_{tot}$,
so for the enantiomeric excess to be greater than 60% the chain length must be at least $l = 13$, and to obtain an
ee of 80% the chain length must be at least $l = 20$. Finally, for the third case, $f_r : f_s : f_{s'} = 0.5 : 0.475 : 0.025$, the
starting fraction of the majority species, $s_{tot}$, is almost 20 times (0.475/0.025 = 19) greater than that of the guest,
$s'_{tot}$. The enantiomeric excess for each chain length is thus expected to be much less than for the two previous cases: an
ee greater than 60% is found for the chain length $l = 27$, and to reach greater than 80% the chain length must be
at least $l = 42$. For the three cases, an increase of the $ee_l$ is observed (for all $l$) when increasing the starting fraction
of the guest species, $s'_{tot}$. When $s'_{tot}$ is comparable to $s_{tot}$, while maintaining the proportion R-type : S-type = 1:1, then
symmetry breaking is ensured to be > 40% for all $l$ > 5.
FIG. 3: Calculated ee values from solving Eqs. (5) for $m = 1$ guest monomer and three different starting monomer compositions
(in relative proportions) $r_{tot} : s_{tot} : s'_{tot}$ = 0.5 : 0.25 : 0.25 (filled circles), 0.5 : 0.45 : 0.05 (squares) and 0.5 : 0.475 : 0.025
(triangles) for the equilibrium constant $K_0 = K_1 = 1000\ \mathrm{M}^{-1}$ and the total monomer concentration $c_{tot} = 10^{-3}$ M. Compare
to Fig. 13 of Nery et al. [21].




The mass balance equations can be used for calculating the amount of free monomers in solution as well as the
amounts of the monomers bound inside the polymers as functions of the starting compositions and $K$. Solving
Eq. (5) yields the amounts of the free monomers, given by $(r, s, s')$, while the amounts of the R, S and S' monomers
inside the copolymers are given by the expressions $r_{poly} = r\,b(2-b)/(1-b)^2$, $s_{poly} = s\,a(2-a)/(1-a)^2$ and
$s'_{poly} = s'\,a(2-a)/(1-a)^2$, respectively. Then the total amount of all the monomers in polymers is given by
$c_{poly} = r_{poly} + s_{poly} + s'_{poly}$, and the total amount of free monomers in solution is $c_{free} = r + s + s'$. In Fig. 4 we
display the values of these quantities for the same three starting compositions considered above as a function of $c_{tot}$
and for $K_0 = K_1 = 1000\ \mathrm{M}^{-1}$. The first row of Fig. 4 indicates how the amount of free monomers in solution, $c_{free}$,
is greater than the amount of those in polymers, $c_{poly}$, for values of $c_{tot}$ below a critical value. Above this value,
$c_{poly} > c_{free}$: that is, the majority of the monomers are to be found in the polymers, not in solution. In the second
row, the different contributions to $c_{poly}$ are plotted for each type of monomer. In the first case $s_{tot} : s'_{tot} = 1:1$, which
leads to $s_{poly} : s'_{poly} = 1:1$ and is the most favorable case for mirror symmetry breaking. Increasing the starting
ratio between $s_{tot}$ and $s'_{tot}$ increases the difference between $s_{poly}$ and $s'_{poly}$, and diminishes the degree of symmetry
breaking. The curves for $s_{poly}$ approach that of $r_{poly}$ as $s'_{tot}$ is diminished (from left to right in Fig. 4). Hence in
the third case, where $s_{tot} : s'_{tot} = 0.475 : 0.025$, almost all the monomers present in the S-type copolymers are the host S monomers.
The same applies for the third row, where the different contributions to $c_{free}$ are plotted. Both the amounts of free
monomers and those forming polymers increase when increasing $c_{tot}$. The degree of mirror symmetry breaking can
be visualized by the gap or vertical distance between the curves for $r_{free}$ and $s_{free}$ versus $c_{tot}$ as the amount of
$s'_{free}$ is varied. In a similar way, Fig. 5 displays the same quantities for fixed $c_{tot} = 10^{-3}$ M as functions of the
equilibrium constant $K$. As before, $c_{poly}$ and its individual contributions all increase with increasing $K$, whereas $c_{free}$
(and its individual contributions) all decrease. Clearly, increasing $K$ favors the formation of the polymers over their
dissociation into free monomers, and we can approximate irreversible polymerization as closely as we please by taking
sufficiently large values of $K$.
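For the special case of equal equilibrium constants the free and bound amounts can be obtained in closed form, which gives a quick way to explore how $c_{free}$ and $c_{poly}$ trade places as $c_{tot}$ grows. The Python sketch below is a simplification that holds only when all $K_j$ are equal: each mass balance equation then reduces to $x_{tot} = x/(1-a)^2$, and summing over a sector with total concentration $X$ gives the quadratic condition $a/(1-a)^2 = K X$. The function names are hypothetical.

```python
from math import sqrt

def sector_a(K, total):
    """Root a in [0, 1) of a / (1 - a)**2 = K * total (equal-K case only)."""
    x = K * total
    if x == 0.0:
        return 0.0
    return ((2.0 * x + 1.0) - sqrt(4.0 * x + 1.0)) / (2.0 * x)

def free_and_poly(K, c_tot, f_r, f_s, f_sp):
    """Return (c_free, c_poly) for the three-monomer case with K_0 = K_1 = K.

    f_r, f_s, f_sp are the starting fractions of R, S and the guest S'.
    """
    b = sector_a(K, c_tot * f_r)              # R sector: host R only
    a = sector_a(K, c_tot * (f_s + f_sp))     # S sector: host S plus guest S'
    c_free = (a + b) / K                      # free monomers: sum of a/K and b/K
    return c_free, c_tot - c_free

# Scan the total concentration for K = 1000 M^-1 and a 0.5 : 0.25 : 0.25 mixture.
for c_tot in (1.0e-4, 5.0e-4, 1.0e-3, 2.0e-3, 1.0e-2):
    c_free, c_poly = free_and_poly(1000.0, c_tot, 0.5, 0.25, 0.25)
    print(c_tot, c_free, c_poly)
```

In this equal-K setting the split between free and bound monomers depends only on the sector totals, not on how the S sector is divided between host and guest; unequal constants require the full nonlinear solution.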

The equilibrium concentration of the S-type copolymer chain of length $m + n = N$ made up of $m$ molecules S and of
$n$ molecules S' is given by $p^S_{m,n} = (K s)^m (K s')^n / K$. Using the solutions from Eqs. (5) we compute the mole fractions
of each S-copolymer, normalized to its own subfamily; this is displayed in Fig. 6 for $2 \leq N \leq 7$.

By way of one further example, we carry out a similar analysis for the case of four monomers, this time for two
majority amino acids R, S and two minority amino acids R', S'. From Eq. (6) we calculate the $ee_l$ for the different chain lengths
$l$ for three different starting compositions. In Fig. 7 we show the numerical results obtained from the solutions of the
set of equations Eq. (5) and Eq. (6), for $K_0 = K_1 = 1000\ \mathrm{M}^{-1}$ and $s_{tot} + s'_{tot} + r_{tot} + r'_{tot} = 10^{-3}$ M.

FIG. 4: $K_0 = K_1 = 1000$. The amounts of free monomers and those bound in polymers as functions of the total monomer
concentration $c_{tot}$ for three different initial relative proportions: $r_{tot} : s_{tot} : s'_{tot}$ = 0.5 : 0.25 : 0.25,
$r_{tot} : s_{tot} : s'_{tot}$ = 0.5 : 0.45 : 0.05 and $r_{tot} : s_{tot} : s'_{tot}$ = 0.5 : 0.475 : 0.025. First row: the total amount of free
monomers and those forming polymers. Second row: the total and individual amounts of monomers forming polymers.
Third row: the total and individual amounts of free monomers. See text for discussion.

As before, we can evaluate the mole fractions of both the S- and R-type copolymers that are in equilibrium with
the free monomer pool. These are displayed in Fig. 8 for the initial total compositions indicated there.

Figures 3 and 7 clearly demonstrate that the higher (lower) the initial degree of chiral asymmetry, characterized by
$r_{tot}/s_{tot}$ in the former and $r'_{tot}/s'_{tot}$ in the latter, the higher (lower) the final asymmetry. Thus, rather than symmetry
breaking per se, we are observing the model's capacity for asymmetric amplification, as stated at the beginning of Section
II A. Nevertheless, effects closer to a symmetry breaking effect can be appreciated by looking at the average chain lengths
for unequal equilibrium constants in Sec. III B.



B. Average lengths of copolymer chains 

As an application of the mean chain length formulas derived in Eqs. (7)-(11), in the following we focus on the simplest
case of $m = 1$ guest. We consider the effect of different equilibrium constants $K_0 \neq K_1$ and a small total system
concentration $c_{tot} = 10^{-3}$ M in Table II. The dependence on varying $c_{tot}$ for fixed but distinct equilibrium constants
$K_0 \neq K_1$ is displayed in Table III.

Most interestingly, in Tables II and III one can see the evolution of the global $r/(s + s')$ asymmetry by looking at the
difference between $\langle l_S \rangle$ and $\langle l_R \rangle$. In particular, the results for the 0.5 : 0.25 : 0.25 case, i.e. starting from a symmetric
$r/(s + s')$ state, show that some chiral asymmetry, albeit small, is obtained between the lengths of the all-R and all-S copolymers.
The source of this asymmetry is the ratio $K_0/K_1$ of the equilibrium constants, which we set to 2 in these examples. By
contrast, when $K_0 = K_1$ there is no difference between $\langle l_R \rangle$ and $\langle l_S \rangle$. Conversely, greater ratios of $K_0/K_1$
lead to greater differences in $\langle l_R \rangle$ and $\langle l_S \rangle$ (data not shown).
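The small length asymmetry produced by unequal equilibrium constants can be reproduced with a short Python sketch. It assumes the mass balance equations and the large-$N$ average-length formulas of Sec. II, solves each chirality sector by bisection (an illustrative choice, not necessarily the method used for the tables), and takes $K_0 = 100000$, $K_1 = K_0/2$, $c_{tot} = 10^{-3}$ M and the 0.5 : 0.25 : 0.25 composition. The function and variable names are hypothetical; the resulting averages can be compared with the $K_0 = 100000$ entries of Table II.

```python
def solve_sector(K, totals, iters=200):
    """Solve x_tot_j = x_j * (1 + (K_j/K_0) * a*(2-a)/(1-a)**2), a = sum_j K_j x_j."""
    def free(a):
        u = a * (2.0 - a) / (1.0 - a) ** 2
        return [t / (1.0 + (k / K[0]) * u) for k, t in zip(K, totals)]
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(iters):                       # bisection on a in [0, 1)
        mid = 0.5 * (lo + hi)
        if sum(k * x for k, x in zip(K, free(mid))) > mid:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return free(a), a

def mean_length(x):
    """Large-N average chain length (2 - x) / (1 - x) for x < 1."""
    return (2.0 - x) / (1.0 - x)

K0 = 1.0e5                                       # M^-1, host
K = [K0, K0 / 2.0]                               # guest constant is K0 / 2
c_tot = 1.0e-3                                   # M

(s, sp), a = solve_sector(K, [0.25 * c_tot, 0.25 * c_tot])   # S and guest S'
(r, _), b = solve_sector(K, [0.50 * c_tot, 0.0])             # R only

l_S = mean_length(a)                             # all S-type copolymers
l_R = mean_length(b)                             # all R-type copolymers
l_all = (a * a * (2 - a) * (1 - b) ** 2 + b * b * (2 - b) * (1 - a) ** 2) / (
    a * a * (1 - a) * (1 - b) ** 2 + b * b * (1 - b) * (1 - a) ** 2)
l_s = mean_length(K[0] * s)                      # chains made of S only
l_sp = mean_length(K[1] * sp)                    # chains made of S' only

print(l_all, l_S, l_R, l_s, l_sp)
```

Although the R and S sectors start with the same total concentration, the weaker binding of the guest lowers $a$ slightly below $b$, and the S-type chains come out marginally shorter than the R-type chains.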

Finally, Tables IV and V have been calculated for the same starting compositions as Figure 7 and can be compared
with the latter.

FIG. 5: c_tot = 10~^. The amounts of free monomers and those in polymers as a function of the equilibrium constant K for
three different initial relative proportions: r_tot : s_tot : s'_tot = 0.5 : 0.25 : 0.25, r_tot : s_tot : s'_tot = 0.5 : 0.45 : 0.05 and
r_tot : s_tot : s'_tot = 0.5 : 0.475 : 0.025. The first row: the total amount of free monomers and those forming polymers. Second
row: the total and individual amounts of monomers forming polymers. Third row: the total and individual amounts of free
monomers. See text for discussion.



TABLE II: Average chain lengths for the three different starting compositions as a function of K_0 for K_1 = K_0/2 and
c_tot = 10-=*M

r_tot : s_tot : s'_tot = 0.5 : 0.25 : 0.25
r_tot : s_tot : s'_tot = 0.5 : 0.45 : 0.05
r_tot : s_tot : s'_tot = 0.5

3.79 3.58 2.03
100000 8.57 8.56 8.59 2.78 2.75 8.59 8.58 8.59 5.60 2.10 8.59 8.59 8.59 6.73 2.04

TABLE III: Average chain lengths for the three different starting compositions as a function of c_tot for K_0 = 100000 and
K_1 = K_0/2

FIG. 6: Mole fractions of the S-type copolymers composed of n S and m S' monomers for m + n = N with 2 ≤ N ≤ 7, as
functions of the total initial fractions r_tot : s_tot : s'_tot. For s_tot : s'_tot = 1 : 1 the distributions are binomial (top), but when
s_tot : s'_tot = 9 : 1 (center) or when s_tot : s'_tot = 19 : 1 (bottom), the distributions are skewed.

FIG. 7: Calculated ee values from solving Eqs. ([5]) for three different starting monomer compositions (in relative proportions)
r_tot : r'_tot : s_tot : s'_tot = 0.3 : 0.1 : 0.3 : 0.3 (filled circles), 0.3 : 0.15 : 0.3 : 0.25 (squares) and 0.3 : 0.18 : 0.3 : 0.22 (triangles) for
the equilibrium constant K_0 = K_1 = 1000 M^-1 and the total monomer concentration c_tot = 10~'^M.



IV. THEORETICAL METHODS II 



A. Probabilistic approach 



In the following sections, we adopt a statistical approach for calculating the likelihood of finding non-enantiomeric
pairs of copolymers formed by the proposed template mechanism. This approach does not require chemical equilibrium.
How many species m of the chiral guest monomers are needed to break mirror symmetry? How many repeat
units N should the chains have? Are there conditions on the polymerization activation energies and mole fractions of
the monomers in solution for maximizing the mirror symmetry breaking? We provide answers to these questions based
on statistics, which means being able to count polymer configurations, distinguishing sequences from compositions,
and applying some basic combinatorial analysis. Indeed, we may regard the specific homochiral copolymerization
sequences formed within the template mechanism as outcomes or "tosses" of generalized multifaceted "dice" (e.g.,
see Figure [2]). However, these dice are loaded, in the sense that not all faces of the generalized die have an equal
probability of turning up in any given throw. This is because different amino acids have different polymerization

FIG. 8: Top set: mole fractions of the S-type copolymers (n, m) for m + n = N with 2 ≤ N ≤ 7 (left to right in each row), as
functions of the total initial fractions r_tot : r'_tot : s_tot : s'_tot indicated there. Bottom set: mole fractions of the R-type copolymers
(n, m) for m + n = N with 2 ≤ N ≤ 7. Since s_tot : s'_tot = 1 : 1 in the first row of the top graph, the distributions are binomial, but
are skewed in all the other cases displayed.



TABLE IV: Average chain lengths for the two different starting compositions as a function of K_0 for K_1 = K_0/2 and
c_tot = 10-^1

2.89

activation energies and may be present in solution in different proportions.

To resolve this problem, we must pay special attention to both the overall composition of the copolymer chain and
its specific sequence. The problem has two basic parts. The first is to calculate the probability that
a given amino-acid sequence is formed from the majority species and whatever minority species are present in their
respective mole fractions in solution. The attachment probability (the probability that the host template occludes
this monomer) of an amino acid monomer to the growing chain is proportional to its concentration in solution and to
a factor depending on its polymerization activation energy. The second part is to count the number of rearrangements
or "shufflings" of the given sequence, as all these independent sequences will have the same probability to form as the
given one. The information from both these parts will permit us to calculate the joint probability that a given sequence
and its mirror image sequence are formed. This in turn will be used to provide a statistical measure of the likelihood
that mirror symmetry is broken: below we derive a compact expression for the probability to find non-enantiomeric
pairs of copolymers in the template (Figure 9). We first need to specify the length N of the homochiral copolymer
chains to be formed, and the number of each minority species or additive m_r, m_s. Thus we consider (r_0, r_1, r_2, ..., r_{m_r})

TABLE V: Average chain lengths for the two different starting compositions as a function of c_tot for K_0 = 100000 and
K_1 = K_0/2

FIG. 9: Homochiral copolymer sequences and their mirror-related sequences. Top row (above the mirror) enumerates all the
possible R-type copolymers γ_R of length N that can be made up from m_r different R-type monomers: there are (m_r + 1)^N
such chains. Below the mirror: mirror-image-related S-type homochiral copolymers γ_S made up of S-type monomers. In this
example m_r < m_s, so there are more S-type copolymers than R-type. A solid vertical line segment links an enantiomeric pair
of sequences; the dotted lines represent examples of non-enantiomeric pairs of sequences. A given composition typically gives
rise to many inequivalent but equiprobable sequences (indicated by the horizontal solid brackets).



and (s_0, s_1, s_2, ..., s_{m_s}), where (r, s) = (r_0, s_0) denotes both enantiomers of the majority species. Following the
experimental scenario, the minority species will typically be present in small mole fractions, whereas the majority
species will be present with a predominantly large mole fraction, as their names suggest. The total number of possible
sequences in a chain with N repeat units for each configuration is (m_r + 1)^N and (m_s + 1)^N. This most general case is
represented in a suggestive pictorial way in Fig. 9. This diagram is used to enumerate all possible chiral copolymers
that can form in the template, laid out in a linear fashion, the totality of R-copolymers strung out above a "mirror"
and the mirror-related S-copolymers directly below it.

Statistical copolymers are those for which the sequence of monomer residues follows a statistical rule. The attachment
probability is proportional to the monomer's concentration in solution, [r_j] or [s_j], times a rate constant that
depends on the activation energy E_j for attachment of that specific monomer to the polymer/template, thus:

    p(r_j) \propto A_j \exp(-E_j/kT)\,[r_j] \equiv w_j [r_j],    (12)
    p(s_j) \propto w_j [s_j].    (13)

To obtain bona fide probabilities, these are normalized so that

    0 \leq p(r_j) = \frac{w_j [r_j]}{\sum_{k=0}^{m_r} w_k [r_k]} \leq 1,   (0 \leq j \leq m_r)    (14)
    0 \leq p(s_j) = \frac{w_j [s_j]}{\sum_{k=0}^{m_s} w_k [s_k]} \leq 1,   (0 \leq j \leq m_s)    (15)

which implies

    \sum_{k=0}^{m_r} p(r_k) = \sum_{k=0}^{m_s} p(s_k) = 1.    (16)
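
The normalization in Eqs. (14)-(16) can be illustrated with a minimal Python sketch. The function name `attachment_probabilities` and the sample weights and concentrations are assumptions chosen for illustration only; the weights stand for the factors w_j = A_j exp(-E_j/kT) and are supplied directly rather than computed from activation energies.

```python
def attachment_probabilities(weights, concentrations):
    """Normalize w_j * [x_j] into probabilities, following Eqs. (14)-(15).

    weights[j]        -- kinetic factor w_j of species j (assumed given)
    concentrations[j] -- concentration (or mole fraction) of species j
    Index 0 is the majority species, indices 1..m are the guests.
    """
    if len(weights) != len(concentrations):
        raise ValueError("need one weight per monomer species")
    raw = [w * c for w, c in zip(weights, concentrations)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("at least one species must have positive weight and concentration")
    return [x / total for x in raw]


# Hypothetical S-type pool: majority s_0 and one guest s_1 in equal amounts,
# with the guest attaching more slowly (w = 0.75).
p_S = attachment_probabilities(weights=[1.0, 0.75], concentrations=[0.25, 0.25])
print(p_S, sum(p_S))
```

Because only the products w_j [x_j] enter, the result is unchanged if all concentrations are rescaled by a common factor, so relative proportions can be used in place of absolute concentrations.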

Normalization ensures that the probability that any single monomer attaches to the template is between zero and
unity: evidently no individual probability can be greater than one, nor can the total probability exceed unity. In
writing down Eq. (12), there are two implicit assumptions being made: (1) the rate of polymerization is independent
of polymer length N, and (2) the probability of any given monomer joining a polymer is independent of the existing
polymer structure. In (1) we are assuming isodesmic polymerization: the successive addition of a monomer to the
growing chain leads to a constant decrease in the free energy, which in turn indicates that the affinity of a subunit for
a polymer end is independent of the length of the polymer [18]. In (2) we assume the polymerization is a first-order
Markov process: the attachment depends only on the nature of the terminal end of the polymer, but not on the
monomer sequence in the chain. Evidence for kinetic Markov mechanisms has been observed experimentally in some
chiral polymerizations fl^- 

Define the attachment probability vectors as

    p_R = \{p(r_0), p(r_1), p(r_2), ..., p(r_{m_r})\}    (17)

    p_S = \{p(s_0), p(s_1), p(s_2), ..., p(s_{m_s})\},    (18)

one for the R-monomers, and one for the S-monomers. Note that in the limit when both minority species are absent,
m_r, m_s → 0, there will be one unique sequence for each handedness: namely, a sequence of N R's and a mirror
sequence of N S's. These two pure sequences will each form with unit probability, since p(r_0) = p(s_0) = 1, as follows
from Eq. (16). An enantiomeric pair will form with absolute certainty when there are no guest additives. This limit
provides an important check on the statistical arguments developed below.
Chain compositions for the R and S type chains are specified as

    n(r_0) + n(r_1) + n(r_2) + ... + n(r_{m_r}) = N,    (19)

and

    n(s_0) + n(s_1) + n(s_2) + ... + n(s_{m_s}) = N,    (20)

where n(r_j) and n(s_j) denote the number of times the j-th R and S monomer occur in the corresponding chain,
respectively. These are ordered partitions of the integer N. Many different sequences can follow from one given
composition, Eqs. (19, 20). By means of the template-controlled polymerization mechanism [21], only homochiral
chains will be formed, that is, chains formed of either all right-handed R or else all left-handed S monomers, and
these can be represented by vectors. For example, for the case of a right-handed chain:

    \gamma_R = \{r, r, r_1, r, r, r_3, ..., r\}_N,    (21)

while its mirror image related sequence is denoted by the vector

    \gamma_S = \{s, s, s_1, s, s, s_3, ..., s\}_N.    (22)

We emphasize that we are comparing the sequences of copolymers made up exclusively of either all right-handed R or
all left-handed S monomers, and not making any claim about their corresponding secondary or tertiary structures.
When we discuss copolymers related through the mirror as in Fig. 9, we refer exclusively to their specific monomeric
sequences or primary structures. The figure enumerates all the possible sequences that can form from the given composition.
The underlying template control is assumed implicitly, thus the system is composed of only homochiral structures,
see Figured 

The probability to form specific sequences of length N from the compositions (19, 20) is given by the composition
probability:

    p(\gamma_R) = \prod_{j=0}^{m_r} p(r_j)^{n(r_j)},    (23)

    p(\gamma_S) = \prod_{j=0}^{m_s} p(s_j)^{n(s_j)}.    (24)

In general, there will be many distinct sequences with exactly the same composition probability, Eqs. (23, 24); see the
horizontal solid line segments in Figure 9. These are re-shufflings or re-orderings of the given sequence, keeping the
individual composition numbers n(r_j), n(s_j) fixed in (19, 20), and the number of such equiprobable sequences will be
calculated below.
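
As a small illustration of Eqs. (23, 24), the following Python sketch computes the probability of one specific sequence from its composition, together with the number of re-shufflings that share that probability (the multinomial coefficient of Eq. (27)). The function names and the sample probability vector are hypothetical; monomer types are encoded as integer indices, with 0 for the majority species.

```python
from collections import Counter
from math import factorial, prod


def sequence_probability(sequence, p):
    """Composition probability of Eq. (23)/(24): product of p[j] ** n(j)."""
    counts = Counter(sequence)  # n(j): how often monomer type j occurs
    return prod(p[j] ** n for j, n in counts.items())


def number_of_reshufflings(sequence):
    """Number of distinct orderings with the same composition: N! / (n_0! n_1! ...)."""
    result = factorial(len(sequence))
    for n in Counter(sequence).values():
        result //= factorial(n)
    return result


# Hypothetical attachment probabilities for a majority species and two guests.
p = [0.8, 0.15, 0.05]
gamma = [0, 0, 1, 0, 0, 2]  # the pattern {r, r, r_1, r, r, r_2} with N = 6

print(sequence_probability(gamma, p))   # 0.8**4 * 0.15 * 0.05
print(number_of_reshufflings(gamma))    # 6! / (4! 1! 1!) = 30
```

Any permutation of `gamma` gives the same value of `sequence_probability`, which is the sense in which the re-shuffled sequences are equiprobable.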



1. Probability to form one enantiomeric pair



First consider the probability to form a specific sequence, call it γ_R. We fix the number of repeat units N and the
number of R-type additives. From the sequence we immediately deduce the composition (or composition vector)
n_R = {n(r_0), n(r_1), ..., n(r_{m_r})}, and we specify the monomer attachment probabilities (or the attachment/occlusion
probability vector), Eq. (17). Next, consider the probability to form its mirror image, that is, γ_S. The number of
repeat units N has already been fixed and we know the number of additives of S type. The composition vector
of the mirror image n_S must be equal to n_R, that is n_S = n_R = n, and we must specify the monomer attachment
probabilities (or the attachment/occlusion probability vector), Eq. (18).

The probabilities to form these sequences from these compositions are given by Eqs. (23, 24) and hence the joint
probability to find the enantiomeric pair γ_R and γ_S is

    P_{pair}(\gamma_R | \gamma_S) = p(\gamma_R)\, p(\gamma_S).    (25)

This is a function of N, min(m_r, m_s), n, p_R and p_S.



2. Probability to form all possible enantiomeric pairs for fixed N 




For computing the probability of forming all possible enantiomeric pairs, we need to know both m_r and m_s and
which one is greater, since limits on the possible enantiomeric pairs that can be formed come from the enantiomer
with the least number of guest species. Without loss of generality we may assume that m_r ≤ m_s. For fixed N and
m_r, the number of distinct compositions of the R type copolymers is given by

    \#_{N, m_r}\{n_0, n_1, n_2, ..., n_{m_r}\} = \binom{m_r + N}{N} = \frac{(m_r + N)!}{N!\, m_r!},    (26)

and the number of different sequences that we can form from each individual composition is given by

    P_N^{n_0, n_1, ..., n_{m_r}} = \binom{N}{n_0, n_1, n_2, ..., n_{m_r}} = \frac{N!}{n_0!\, n_1!\, n_2! \cdots n_{m_r}!}.    (27)

Summing the latter expression over all the possible compositions with fixed N must be equal to the total number of
different sequences, that is, we obtain the multinomial theorem [s^]:

    \sum_{n_0 + n_1 + ... + n_{m_r} = N} \binom{N}{n_0, n_1, n_2, ..., n_{m_r}} = (m_r + 1)^N.

Recall, the joint probability to form a particular sequence and its mirror image sequence is given by Eq. (25). The
net probability we seek to evaluate is:

    P_{pairs}(N, m_r) = \sum_{\gamma_R} P_{pair}(\gamma_R | \gamma_S).    (28)

This expression is the probability that each and every possible sequence in R and its mirror image sequence in S are
formed of fixed length N. For this purpose, we will first sum over all different (but equiprobable) sequences belonging
to the same composition and then sum over all different compositions for N repeat units. That is,
\sum_{all sequences} = \sum_{compositions} ( \sum_{equiprobable sequences} ). From Eq. (27), each given composition can be rearranged in
P_N^{n_0, n_1, ..., n_{m_r}} different ways. For a given composition, all the sequences that can be made therefrom (re-shufflings)
are equiprobable. Thus summing over all these possible rearrangements, we arrive at the probability to form chains and
their mirror image sequences within one such equiprobable equivalence class (recall m_r ≤ m_s):

    P_N^{n_0, n_1, ..., n_{m_r}}\, p(\gamma_R)\, p(\gamma_S) = \binom{N}{n_0, n_1, ..., n_{m_r}} \prod_{j=0}^{m_r} p(r_j)^{n_j} \prod_{i=0}^{m_r} p(s_i)^{n_i}.    (29)

Finally, summing this result over all the different compositions, we calculate the net probability to form homochiral
chains and their mirror image sequences in the system, i.e., the probability to form all possible enantiomeric pairs.
Thus, the probability that mirror symmetry is not broken for m additives and N repeat units is given by

    P_{no break}(N, m_r) = P_{pairs}(N, m_r)
                         = \sum_{n_0 + n_1 + n_2 + ... + n_{m_r} = N} P_N^{n_0, n_1, ..., n_{m_r}}\, p(\gamma_R)\, p(\gamma_S).    (30)

Then the probability that mirror symmetry is broken for these values of m and N is:

    P_{break}(N, m_r) = 1 - \sum_{n_0 + n_1 + ... + n_{m_r} = N} P_N^{n_0, n_1, ..., n_{m_r}}\, p(\gamma_R)\, p(\gamma_S)
                      = 1 - (p_{r_0} p_{s_0} + p_{r_1} p_{s_1} + ... + p_{r_{m_r}} p_{s_{m_r}})^N
                      = 1 - (p_R \cdot p_S)^N,    (31)

which follows from the multinomial theorem [s^] and after using Eqs. (16, 17, 18).
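
The closed form of Eq. (31) can be checked numerically for small systems by enumerating every sequence explicitly. The sketch below is such a brute-force check in Python; the function names and the sample probability vectors are assumptions made for illustration, and the enumeration is only practical for small N and m_r because it visits (m_r + 1)^N sequences.

```python
from itertools import product
from math import isclose, prod


def p_no_break_bruteforce(p_R, p_S, N):
    """Sum p(gamma_R) * p(gamma_S) over all sequences of length N, as in Eq. (28).

    Only monomer types present in both pools can appear in an enantiomeric
    pair, so the enumeration runs over the first min(len(p_R), len(p_S)) types.
    """
    shared = min(len(p_R), len(p_S))
    total = 0.0
    for seq in product(range(shared), repeat=N):
        total += prod(p_R[j] for j in seq) * prod(p_S[j] for j in seq)
    return total


def p_break_closed_form(p_R, p_S, N):
    """Eq. (31): 1 - (p_R . p_S)^N; zip() truncation equals zero-padding p_R."""
    dot = sum(a * b for a, b in zip(p_R, p_S))
    return 1.0 - dot ** N


# Hypothetical pools: one R-type guest (m_r = 1), two S-type guests (m_s = 2).
p_R = [0.9, 0.1]
p_S = [0.7, 0.2, 0.1]
N = 4

brute = 1.0 - p_no_break_bruteforce(p_R, p_S, N)
closed = p_break_closed_form(p_R, p_S, N)
print(brute, closed, isclose(brute, closed))
```

Agreement between the two numbers reflects the multinomial theorem: the sum over sequences factorizes into N identical factors p_R · p_S.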



B. Chiral additives 



FIG. 10: Pictorial situation for chiral guest additives. In this case, only additives of one chirality (in S) are added, so m_r = 0
and m_s = m, with N fixed. Top row (above the mirror): the unique R-type polymer of length N that can be made up from
the single R-type monomer present in the system. Below the mirror: all the S-type copolymers made up of m distinct S-type
monomers. The solid vertical line indicates the single unique enantiomeric pair; the dotted lines represent all the non-enantiomeric
pairs of sequences for m_s > 1.



This is a special case of the general one described above, and is pictorially sketched out in Fig. 10. Here we consider
m_r = 0 and we set m_s = m. Clearly, there is only one possible composition (and hence, sequence) that can be formed
in R, namely the pure homochiral sequence made up of N repeat units of r, γ_R = {r, r, ..., r}_N, and this
forms with unit probability: p(γ_R) = 1. Thus, the probability that mirror symmetry is not broken for m types of
S-additives and for N repeat units is given by Eq. (30), which simplifies to give:

    P_{no break}(N, m) = P_{pairs}(N, m) = p(\gamma_R)\, p(\gamma_S) = p(s_0)^N.    (32)

Then the probability that mirror symmetry is broken for these values of m and N is:

    P_{break}(N, m) = P_{no pairs}(N, m) = p(\gamma_R)(1 - p(\gamma_S)) = 1 - p(s_0)^N.    (33)

If the number of S-type additives goes to zero, m_s → 0, then p(s_0) → 1 and mirror symmetry is maintained
with absolute certainty.



C. Ideal Racemic Additives 



For our final example, we deal with the case in which all the additives are supplied in ideally racemic proportions,
that is, we have equal numbers of enantiomer types m_r = m_s = m, and all are supplied in identical concentrations:

FIG. 11: Pictorial scheme for the case of all racemic additives: m_r = m_s = m; N is fixed. Compare to Figure 9.



[r_j] = [s_j] for all types 0 ≤ j ≤ m = m_r = m_s. This situation is pictorially represented in Fig. 11. Since the
activation energies are the same for both R and S enantiomers, then from Eqs. (14, 15) p(r_j) = p(s_j) and so from
Eqs. (23, 24) p(γ_R) = p(γ_S) = p(γ). All the sequences that are obtained from re-shuffling the original one have the
same probability for polymerizing: they define equivalence classes of equiprobable sequences. The number of compositions,
Eq. (26), gives the number of distinct equivalence classes of equiprobable sequences (Figure 11). An important check on the formalism
and numerics that follow from it is that the probability for mirror symmetry breaking must go to zero as the number
m of (ideally racemic) monomer additives goes to zero. We will see that these expectations are confirmed.

For ideally racemic additives, the probability that mirror symmetry is not broken for m additives and N repeat
units is given by Eq. (30), which reduces to:

    P_{no break}(N, m) = ((p_R)^2)^N,    (34)

where (p_R)^2 = p_R · p_R. Now, from Eq. (31) the probability that mirror symmetry is broken for m species of racemic
additives and for chain length N is

    P_{break}(N, m) = 1 - ((p_R)^2)^N.    (35)

This result is important: it says that even for ideally racemic initial proportions in all the host and guest amino acids,
there is a finite probability 1 > P_break(N, m) > 0 for statistical or stochastic breaking of mirror symmetry. This
mirror symmetry breaking is manifested in the formation of non-enantiomeric pairs of homochiral sequences within 
the template, in support of the proposed experimental scenario [2H. 



V. RESULTS 



From Eqs. (14-18) the monomer attachment probability vectors p_R and p_S define the faces of two standard or
unit m_r- and m_s-simplexes [31]. These simplex faces represent the domains of all allowed monomer attachment
probabilities, see the shaded regions in Fig. 12 and Fig. 13. This allows us to find the basic physico-chemical criteria
for maximizing (or minimizing) the probability for broken mirror symmetry in template-controlled polymerization
[21], [2^], [25]. For an m-simplex there is a maximum and a minimum distance from the origin. The maximum distance
pertains when the attachment probability vector p coincides with one of the m + 1 vertices, and in these cases we
have ||p|| = 1. The minimum distance is achieved for the point defined by the centroid of the simplex face, located at
(1/(m+1), 1/(m+1), ..., 1/(m+1)) (a vector with (m + 1) components), and its modulus is ||p|| = 1/\sqrt{m + 1}.
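
The two extreme distances quoted above are easy to confirm numerically. The following Python sketch is illustrative only; the helper names `modulus`, `vertex` and `centroid` are assumptions and do not come from the paper.

```python
from math import sqrt


def modulus(p):
    """Euclidean norm ||p|| of an attachment probability vector."""
    return sqrt(sum(x * x for x in p))


def vertex(m, k):
    """k-th vertex of the unit m-simplex: species k attaches with certainty."""
    return [1.0 if i == k else 0.0 for i in range(m + 1)]


def centroid(m):
    """Centroid of the unit m-simplex: all m + 1 species equally probable."""
    return [1.0 / (m + 1)] * (m + 1)


m = 3  # three additives plus the majority species
print(modulus(vertex(m, 0)))                  # maximum distance, ||p|| = 1
print(modulus(centroid(m)), 1 / sqrt(m + 1))  # minimum distance, 1/sqrt(m + 1)
print(modulus([0.7, 0.1, 0.1, 0.1]))          # a generic point lies in between
```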



A. General case: non-racemic additives. 



For this general case in which the number of additives of each enantiomer type can be distinct, the attachment
probability vectors p_R and p_S have m_r + 1 and m_s + 1 components, respectively. In principle, they are vectors in simplexes
of different dimensions. The limits on the number of possible mirror-related copolymer pairs that can be formed come
from the enantiomer with the least number of additives, which, without loss of generality, we take to be m_r.

FIG. 12: The unit m-simplex, illustrated for the case of m = 2 amino acid additives. The three vertices are located at the points
(1, 0, 0), (0, 1, 0) and (0, 0, 1) and correspond to maximum attachment probabilities (mirror symmetry is conserved). Shaded
area (green) corresponds to the domain of all allowed attachment vectors. A generic point (broken arrow) corresponds to a
positive probability for symmetry breaking. The centroid (1/3, 1/3, 1/3) (solid arrow) corresponds to the maximum probability
for mirror symmetry breaking.



FIG. 13: The unit m-simplex, illustrated for the case of m = 3 additives. The four vertices are located at the points
(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0) and (0, 0, 0, 1) and correspond to maximum attachment probabilities (mirror symmetry is
conserved). The centroid (blue dot) corresponds to maximal mirror symmetry breaking.

Express the vector p_R in the m_s-simplex by simply redefining p_R = {p_{r_0}, p_{r_1}, ..., p_{r_{m_r}}, 0, 0, ..., 0} with m_s − m_r zero
entries; then the probability for mirror symmetry breaking follows from Eq. (31):

    P_{break}(N, m_r) = 1 - (p_R \cdot p_S)^N.    (36)

Now, both attachment vectors can be regarded as belonging to the same m_s-simplex. This can be visualized graphically
as an m_s-simplex with one of its subspaces being the m_r-simplex (the subspace can be just a point of the m_s-simplex
(m_r = 0), a line (m_r = 1), a face (m_r = 2), etc.). Fig. 14 represents the case in which m_s = 3 and
m_r = 2; here the subspace of the m_s-simplex corresponding to the m_r-simplex is a face of the tetrahedron. Then the
probability of breaking symmetry is minimal (zero) when the attachment probability vectors p_R and p_S are parallel,
for then p_R · p_S = 1 and P_break(N, m_r) = 0. In order for both vectors to be parallel, both must be in the subspace
of the m_s-simplex that coincides with the m_r-simplex. The maximum probability for breaking mirror symmetry,
P_break(N, m_r) = 1, is achieved when the attachment probability vectors are orthogonal and thus coincide with two
different vertices of the m_s-simplex (see Figure 13). p_R must be in one of the m_r + 1 vertices and p_S can be in one of
the m_s + 1 vertices, always different from the vertex in which p_R lies. In this case, the species in r will be different
from the species in s, so it will be impossible to form enantiomeric pairs of homochiral chains.

Typically, copolymers will be formed with lengths ranging from the dimer, trimer, etc. on up to a maximum number
of repeat units [22, [12, HE]. The above arguments apply to any value of N, thus the probability P^{≤N}_break(m_r) to break
mirror symmetry in a system containing a spectrum of chain lengths 2 ≤ n ≤ N is given as follows,

    P_{break}^{\leq N}(m_r) = \frac{1}{N-1} \sum_{n=2}^{N} P_{break}(n, m_r)
                           = 1 - \frac{(p_R \cdot p_S)^2 \left(1 - (p_R \cdot p_S)^{N-1}\right)}{(N-1)(1 - p_R \cdot p_S)},    (37)

and satisfies \lim_{p_R \cdot p_S \to 1} P_{break}^{\leq N}(m_r) = 0 when the two occlusion probability vectors are parallel.



FIG. 14: Both the unit m_r- and m_s-simplexes, illustrated for the particular case of m_r = 2 and m_s = 3 additives. The
three vertices of the m_r-simplex are located at the points (1, 0, 0), (0, 1, 0) and (0, 0, 1) and correspond to maximum attachment
probabilities of r. These three vertices of the m_r-simplex also coincide with three of the four vertices of the m_s-simplex, located
at the points (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0) and (0, 0, 0, 1) and corresponding to maximum attachment probabilities of s.



B. Additives of only one handedness. 

This is a particular case of the above for m_s, when m_r = 0. The probability of breaking mirror symmetry
depends only on p(s_0), the attachment probability of the S-enantiomer of the majority species. Each monomer
attachment vector is in a different simplex with different dimensions: p_S is in an m_s-simplex and p_R will coincide with
a vertex of the m_s-simplex.

The minimal probability of breaking symmetry, P_break(N, m) = 0, is obtained for p(s_0) = 1. In this case there are
no guests, only the majority species s_0, so we recover the case in which additives are supplied in racemic proportions,
and moreover, no guests are added. In this case, the attachment probability vector p_S coincides with one of the
m + 1 vertices, the vertex corresponding to p_R and to maximum p(s_0). The maximum probability of breaking mirror
symmetry, P_break(N, m) = 1, is obtained for p(s_0) = 0. In this case, the majority species in S is absent, thus no
possible enantiomeric pairs can be formed; that is, the vector p_S can lie anywhere in the m_s-simplex, except at the
m_r vertex.

In this case, the probability P^{≤N}_break(m) to break mirror symmetry in a system containing a spectrum of chain lengths
2 ≤ n ≤ N, Eq. (37), reduces to

    P_{break}^{\leq N}(m) = 1 - \frac{p(s_0)^2 \left(1 - p(s_0)^{N-1}\right)}{(N-1)(1 - p(s_0))},    (38)

and satisfies \lim_{p(s_0) \to 1} P_{break}^{\leq N}(m) = 0 when no minority species of the S-type is supplied.

The cases with two majority species r and s and one guest, s', with starting fractions f_r : f_s : f_s', as considered in
the first section of the paper, would be a case of additives of only one handedness or chiral additive, where m_r = 0 and
m_s = m = 1. Following Eq. (38) we can calculate P^{≤N}_break(m) for the three different starting compositions considered
before. Exemplary numerical results are shown in Tables VI and VII and in Figure 15, showing the effect of varying
the relative concentrations of all the monomers and the activation energy (we vary w_s') of the guest monomer s'. The
curves for P_break are qualitatively similar to those of the percent ee in Figure [3].
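
The length-averaged probability for a single chiral guest can be evaluated directly. The Python sketch below compares the explicit average over n = 2, ..., N with the closed form of Eq. (38); the function names are assumptions made for illustration, and p(s_0) is computed from the starting fractions via the normalization of Eq. (15) with m_s = 1.

```python
def p_s0(f_s, f_guest, w_s=1.0, w_guest=1.0):
    """Attachment probability of the majority S monomer, Eq. (15) with m_s = 1."""
    return w_s * f_s / (w_s * f_s + w_guest * f_guest)


def p_break_up_to_direct(N, x):
    """Average of P_break(n) = 1 - x**n over chain lengths n = 2..N."""
    return sum(1.0 - x ** n for n in range(2, N + 1)) / (N - 1)


def p_break_up_to_closed(N, x):
    """Closed form of Eq. (38); x = p(s_0). The x -> 1 limit is 0."""
    if x == 1.0:
        return 0.0
    return 1.0 - x ** 2 * (1.0 - x ** (N - 1)) / ((N - 1) * (1.0 - x))


lengths = (5, 10, 15, 20, 25, 30)
for f_s, f_guest in [(0.25, 0.25), (0.45, 0.05), (0.475, 0.025)]:
    x = p_s0(f_s, f_guest)  # equal weights, as in Table VI
    print(f_s, f_guest, [round(p_break_up_to_closed(N, x), 2) for N in lengths])
```

With equal weights the N = 5 entries come out as 0.88, 0.30 and 0.16, matching the first column of Table VI; passing `w_guest=0.75` gives 0.83 for the 0.5 : 0.25 : 0.25 composition at N = 5, as in Table VII.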

TABLE VI: w_r = w_s = w_s' = 1. Probability to break mirror symmetry, P^{≤N}_break(m), for the three different starting compositions
f_r : f_s : f_s' of the three-component case (m_r = 0 and m_s = m = 1) as a function of repeat units N.

P^{≤N}_break(m)       N = 5   N = 10   N = 15   N = 20   N = 25   N = 30

0.5 : 0.25 : 0.25     0.88    0.94     0.96     0.97     0.98     0.98

0.5 : 0.45 : 0.05     0.30    0.45     0.55     0.63     0.69     0.73

0.5 : 0.475 : 0.025   0.16    0.26     0.34     0.41     0.47     0.52



TABLE VII: w_r = w_s = 1, w_s' = 0.75. Probability to break mirror symmetry, P^{≤N}_break(m), for the three different starting
compositions f_r : f_s : f_s' of the three-component case (m_r = 0 and m_s = m = 1) as a function of repeat units N.

P^{≤N}_break(m)       N = 5   N = 10   N = 15   N = 20   N = 25   N = 30

0.5 : 0.25 : 0.25     0.83    0.92     0.95     0.96     0.97     0.97

0.5 : 0.45 : 0.05     0.24    0.37     0.47     0.54     0.61     0.66

0.5 : 0.475 : 0.025   0.12    0.22     0.27     0.33     0.39     0.43



C. Racemic additives. 



When the enantiomeric species are provided in ideally racemic proportions, the probability that mirror symmetry
is broken for given values of the chain length N and number of species m can be expressed succinctly as

    P_{break}(N, m) = 1 - \|p\|^{2N},    (39)

that is, one minus the squared modulus of the attachment probability vector p, raised to the chain length. Thus,
for fixed N, in order to maximize the probability that mirror symmetry be broken, we should prepare the chemical
system so that all m additives and the majority species have equally shared mole fractions. For any other point in the
face (including the centroid), but excluding the m + 1 vertices, ||p|| < 1, hence the probability to break mirror
symmetry increases with chain length N and/or with increasing number of additives m, provided these are supplied
with small mole fractions (to prevent p from coinciding with the vertices).

Finally, if the occlusion probability vector p coincides with any one of the m + 1 vertices, then ||p|| = 1, so
P_break(N, m) = 0, and mirror symmetry is maintained with absolute certainty for all N. Each vertex corresponds to
a chemical system with only one type of monomer (and its enantiomer), in other words, a system with no additives.
The m-th vertex corresponds to the m-th amino acid being the sole species present in the system. Thus we see that
if we increase the mole fraction of any one of the additives in excess, the tables are turned, and the majority and
minority species interchange their roles: excessive amounts of any additive tend to reduce the probability for breaking
mirror symmetry.
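
The behaviour of Eq. (39) at the centroid, at a generic point and at a vertex can be compared with a few lines of Python. This is an illustrative sketch; the function name `p_break_racemic` and the sample vectors are assumptions, not values taken from the paper.

```python
def p_break_racemic(p, N):
    """Eq. (39): P_break = 1 - ||p||^(2N) for ideally racemic additives."""
    norm_sq = sum(x * x for x in p)  # ||p||^2 = p . p
    return 1.0 - norm_sq ** N


N = 5
centroid = [1 / 3, 1 / 3, 1 / 3]   # m = 2 additives, all species equally probable
skewed = [0.9, 0.05, 0.05]         # majority species dominates
vertex = [1.0, 0.0, 0.0]           # no additives occluded at all

print(p_break_racemic(centroid, N))  # largest of the three, 1 - 3**(-N)
print(p_break_racemic(skewed, N))    # intermediate
print(p_break_racemic(vertex, N))    # exactly 0: symmetry is never broken
```

At the centroid ||p||^2 = 1/(m + 1), so P_break = 1 − (m + 1)^(−N), which grows with both N and m, in line with the trend described in the text.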

Eq. (37) now simplifies to give

    P_{break}^{\leq N}(m) = 1 - \frac{\|p\|^4 \left(1 - \|p\|^{2(N-1)}\right)}{(N-1)(1 - \|p\|^2)},

and satisfies P^{≤N}_break(m) = 0 at the vertices of the simplex. As expected, we find increasing probability for
symmetry breaking as N and/or m increase. A comparison of the two Tables confirms that the probabilities are
maximized for each N and m when all species are supplied in equal proportions. The probability to break mirror
symmetry is strictly zero when there are no additives: P^{≤N}_break(m = 0) = 0.

The cases with two majority species r and s and two guests, r' and s', with starting fractions f_r : f_r' : f_s : f_s', as
considered in the first section of the paper, are a case of racemic additives where m_r = m_s = 1. Following Eq. (39) we
can calculate P^{≤N}_break(m) for the three different starting compositions considered before. The results (not shown) are
qualitatively very similar to those shown in the previous tables.



VI. CONCLUSIONS 

The proposed [llj experimental mechanism leads to the formation of homochiral copolymers with random sequences 
of the majority and minority amino acids. Given the implications of the experimental mechanism, we have provided 
two independent and complementary theoretical approaches to the problem. The first one is based on approximate 
chemical equilibrium, the second on statistical principles and combinatorics. Both these approaches provide further
quantitative insights into the template-controlled induced desymmetrization mechanisms advocated by Lahav and
coworkers [20-26].

In the first approach, appealing to chemical equilibrium, the template or beta sheet is in approximate equilibrium 
with the free monomer pool. We obtain a multinomial sample space for the distribution of equilibrium concentrations 
of the homochiral copolymers. We then deduce mass balance equations for the enantiomers of the individual amino
acid species, and their numerical solutions are used to evaluate the sequence-dependent copolymer concentrations, in 
terms of the total species concentrations. Measurable quantities signalling the degree of mirror symmetry breaking 
such as the enantiomeric excess ee, relative abundances and average chain lengths are evaluated as functions of initial 
monomer concentrations and the individual equilibrium constants. We can take these constants as large as desired to 
approximate irreversible polymerization. 

The second, or probabilistic, description confirms that this is a viable mechanism for stochastic mirror symmetry 
breaking. We give criteria for the chemical conditions leading to either maximal or minimal probabilities for breaking 
mirror symmetry in this experimental context. The probability for finding non-enantiomeric pairs of copeptide chains
of different sequences increases as a function of increasing chain length and increasing number of guest amino acids. 
We can calculate the probabilities of all these joint outcomes in terms of the basic monomer attachment/occlusion 
probabilities. These probabilities can be calculated in terms of the monomer concentrations and take into account 
the fact that different amino acids have different polymerization activation energies. The solution of the full problem 
admits an appealing visual and geometric interpretation in terms of unit-simplexes which summarize the allowed 
relative polymerization rates and concentrations of the different amino acids involved. 
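
The dependence on chain length and on the number of amino acid species can be illustrated with a minimal numerical sketch. It is not the full treatment developed in the paper but a simplified toy version resting on assumptions made here for illustration only: each site of a chain is filled independently, the occlusion probability of a species is proportional to its concentration times an Arrhenius factor exp(-E_a/RT), and, for an ideally racemic pool, the R and S enantiomers of a given species share the same probability. Under these assumptions an R chain and an S chain of length N carry the same sequence with probability (sum_i p_i^2)^N. The function names and the default temperature are hypothetical.

```python
import math

R_GAS = 8.314  # gas constant in J/(mol K)


def occlusion_probabilities(concentrations, activation_energies, temperature=298.15):
    """Normalized occlusion probabilities, one per amino acid species.

    Assumption (illustrative only): weight_i = c_i * exp(-E_i / (R T)),
    with energies in J/mol, followed by normalization to unit sum.
    """
    weights = [
        c * math.exp(-ea / (R_GAS * temperature))
        for c, ea in zip(concentrations, activation_energies)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def prob_non_enantiomeric_pair(probs, chain_length):
    """Probability that an R chain and an S chain of equal length differ in sequence.

    Assumes independent site-by-site occlusion with the same probabilities
    for both enantiomeric chains (ideally racemic monomer pool).
    """
    p_match_per_site = sum(p * p for p in probs)  # both chains pick the same species
    return 1.0 - p_match_per_site ** chain_length


if __name__ == "__main__":
    # One host and one guest species, equal concentrations and activation energies.
    probs = occlusion_probabilities([1.0, 1.0], [50e3, 50e3])
    for n in (1, 2, 5, 10):
        print(n, prob_non_enantiomeric_pair(probs, n))
```

In this toy setting the probability of a non-enantiomeric pair grows with the chain length and, for equal occlusion probabilities, with the number of species, in line with the qualitative statement above. The probability vector itself is a point of the unit simplex.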

There are two important points worth emphasizing. First, our theoretical models invoke the underlying template 
control in that they do not allow for any heterochiral oligomers to form. The sequence of the host and guest amino 
acids within the homochiral peptides assembles in a completely random fashion, in accord with the experiments [21]. 
This sequence randomness is captured by both the model based on chemical equilibrium and by the second model 
based on the monomer occlusion probabilities. Secondly, the statistical/combinatorial effects do lead to a stochastic 
mirror symmetry breaking effect. The symmetry breaking in these experiments arises from combinatorics, not from 
spontaneous (bifurcation) phenomena. These stochastic/statistical/combinatorial effects are not due to the inherent 
tiny chiral fluctuations present in all real chemical systems [39, 40, 43], but rather to the random occlusion of 
host and guest amino acids by the chiral sites of the template: the mechanisms proposed here work even for ideally 
racemic mixtures. Mirror symmetry is broken in the sequences, as non-enantiomeric pairs of oligomers are formed. 
The solution of free monomers can nevertheless be optically inactive. The symmetry breaking is to be found in the 
template, or β-sheet, but not (necessarily) in the solution. 

An important distinction must be drawn between the two types of symmetry breaking/amplification treated in this paper. 
The first part (Sec. II) treats the global symmetry of the system, whose breaking can lead to global chiral effects, whereas 
the latter part (Sec. IV) concerns local asymmetries (specific all-R versus all-S amino acid sequences) that 
could be invisible at the global scale. One asymmetry is not guaranteed to imply the other. 

The experiments [IJl motivating the present study shed valuable light on the role of templates in the origin of 
homochiral peptides. Recent works have discussed the potential roles of peptide (amino acid) β-sheets in the origin of 
life, underscoring their effective protection against decomposition and racemization (recovery of mirror symmetry) as 
well as their catalytic ability towards hydrolysis [33, 34]. An experimental demonstration of the formation of β-sheets 
that serve as catalysts for peptide condensation and self-replication has been reported recently [35]. Other groups 
have demonstrated that β-sheet templates can affect enzyme-assisted amino acid polymerization [36], and replication 
of cyclic peptide structures [37]. Such templates might have enjoyed a considerable enantioselective advantage in a 
prebiotic environment [38]. 

In closing, we note that the symmetry breaking mechanism of Lahav and coworkers [21, 25] has some features in 
common with the qualitative scenarios of Green, Eschenmoser and Siegel in which a deficient or limited supply of 
material results in a stochastic symmetry breaking process [4l| - |4^ . 